C-SWAP: Explainability-Aware Structured Pruning for Efficient Neural Networks Compression
Bauvin, Baptiste, Baret, Loïc, Ahmad, Ola
Neural network compression has gained increasing attention in recent years, particularly in computer vision applications, where model reduction is crucial for overcoming deployment constraints. Pruning is a widely used technique that promotes sparsity in model structures, e.g. weights, neurons, and layers, reducing size and inference costs. Structured pruning is especially important, as it removes entire structures, which further accelerates inference and reduces memory overhead. However, it can be computationally expensive, requiring iterative retraining and optimization. To overcome this problem, recent methods consider a one-shot setting, applying pruning directly post-training. Unfortunately, they often lead to a considerable drop in performance. In this paper, we address this issue by proposing a novel one-shot pruning framework that relies on explainable deep learning. First, we introduce a causal-aware pruning approach that leverages cause-effect relations between model predictions and structures in a progressive pruning process. It allows us to efficiently reduce the size of the network while ensuring that the removed structures do not deter the model's performance. Then, through experiments on convolutional neural network and vision transformer baselines pre-trained on classification tasks, we demonstrate that our method consistently achieves substantial reductions in model size with minimal impact on performance, and without the need for fine-tuning. Overall, our approach outperforms its counterparts, offering the best trade-off. Our code is available on GitHub.
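The abstract gives only the high-level idea. As a hypothetical sketch (the toy linear model and all function names are assumptions, not C-SWAP's actual algorithm), a progressive, ablation-driven pruning loop that scores each structure by its causal effect on the prediction might look like:

```python
def predict(weights, x, mask):
    """Toy linear 'model': output under a per-structure binary mask."""
    return sum(m * w * xi for m, w, xi in zip(mask, weights, x))


def ablation_effect(weights, x, mask, i):
    """Causal effect of structure i: output change when it is removed."""
    base = predict(weights, x, mask)
    trial = list(mask)
    trial[i] = 0
    return abs(base - predict(weights, x, trial))


def progressive_prune(weights, x, tolerance):
    """Remove the least-influential structure repeatedly, stopping when
    removing any remaining structure would shift the prediction by more
    than `tolerance` (a sketch of progressive cause-effect pruning)."""
    mask = [1] * len(weights)
    while True:
        alive = [i for i in range(len(mask)) if mask[i]]
        if len(alive) <= 1:
            break
        i = min(alive, key=lambda j: ablation_effect(weights, x, mask, j))
        if ablation_effect(weights, x, mask, i) > tolerance:
            break
        mask[i] = 0
    return mask


# Structures with near-zero effect on the output get pruned first.
mask = progressive_prune([0.01, 2.0, -0.02, 1.5], [1.0] * 4, tolerance=0.1)
```

Here the two near-zero weights are removed while the influential ones survive, which is the intended one-shot behavior: no retraining, only ablation measurements.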
STADE: Standard Deviation as a Pruning Metric
Mecke, Diego Coello de Portugal, Alyoussef, Haya, Koloiarov, Ilia, Stubbemann, Maximilian, Schmidt-Thieme, Lars
Recently, Large Language Models (LLMs) have become very widespread and are used to solve a wide variety of tasks. To successfully handle these tasks, LLMs require longer training times and larger model sizes. This makes LLMs ideal candidates for pruning methods that reduce computational demands while maintaining performance. Previous methods require a retraining phase after pruning to maintain the original model's performance. However, state-of-the-art pruning methods, such as Wanda, prune the model without retraining, making the pruning process faster and more efficient. Building upon Wanda's work, this study provides a theoretical explanation of why the method is effective and leverages these insights to enhance the pruning process. Specifically, a theoretical analysis of the pruning problem reveals a common scenario in Machine Learning where Wanda is the optimal pruning method. Furthermore, this analysis is extended to cases where Wanda is no longer optimal, leading to the development of a new method, STADE, based on the standard deviation of the input. From a theoretical standpoint, STADE demonstrates better generality across different scenarios. Finally, extensive experiments on Llama and Open Pre-trained Transformers (OPT) models validate these theoretical findings, showing that depending on the training conditions, Wanda's optimal performance varies as predicted by the theoretical framework. These insights contribute to a more robust understanding of pruning strategies and their practical implications. Code is available at: https://github.com/Coello-dev/STADE/
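Wanda's criterion scores a weight by its magnitude times the norm of the corresponding input feature. A minimal sketch contrasting it with a standard-deviation-based score in the spirit of STADE (the exact STADE formula may differ; this is an assumed form for illustration):

```python
import math


def wanda_scores(weights, inputs):
    """Wanda-style score: |w_j| * ||x_j||_2, where x_j collects input
    feature j across calibration samples."""
    norms = [math.sqrt(sum(x[j] ** 2 for x in inputs))
             for j in range(len(weights))]
    return [abs(w) * n for w, n in zip(weights, norms)]


def stade_scores(weights, inputs):
    """STADE-style score: weigh |w_j| by the standard deviation of input
    feature j instead of its norm (assumed form, for illustration)."""
    n = len(inputs)
    scores = []
    for j, w in enumerate(weights):
        col = [x[j] for x in inputs]
        mean = sum(col) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / n)
        scores.append(abs(w) * std)
    return scores


# A large but constant input feature: high norm, zero variance.
# Wanda ranks it as important; a deviation-based score does not.
xs = [[10.0, 1.0], [10.0, -1.0]]
w = [1.0, 1.0]
```

The constant-feature case is exactly where the two criteria disagree, matching the paper's point that which score is optimal depends on the input distribution seen during training.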
MaskPrune: Mask-based LLM Pruning for Layer-wise Uniform Structures
Qin, Jiayu, Tan, Jianchao, Zhang, Kefeng, Cai, Xunliang, Wang, Wei
The remarkable performance of large language models (LLMs) in various language tasks has attracted considerable attention. However, the ever-increasing size of these models presents growing challenges for deployment and inference. Structured pruning, an effective model compression technique, is gaining increasing attention due to its ability to enhance inference efficiency. Nevertheless, most previous optimization-based structured pruning methods sacrifice the uniform structure across layers for greater flexibility to maintain performance. The heterogeneous structure hinders the effective utilization of off-the-shelf inference acceleration techniques and impedes efficient configuration for continued training. To address this issue, we propose a novel masking learning paradigm based on minimax optimization to obtain the uniform pruned structure by optimizing the masks under sparsity regularization. Extensive experimental results demonstrate that our method can maintain high performance while ensuring the uniformity of the pruned model structure, thereby outperforming existing SOTA methods.
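The key constraint in the abstract is layer-wise uniformity: every layer must keep the same number of structures so off-the-shelf acceleration applies. As an illustrative sketch only (the paper learns masks via minimax optimization under sparsity regularization; the hard top-k projection below is an assumed simplification), enforcing a uniform kept count from learned mask values might look like:

```python
def uniform_prune(layer_masks, keep):
    """Enforce the same kept-structure count in every layer: per layer,
    keep the `keep` largest mask values and zero out the rest, so the
    pruned model has a uniform structure across layers."""
    pruned = []
    for masks in layer_masks:
        top = set(sorted(range(len(masks)), key=lambda i: -masks[i])[:keep])
        pruned.append([m if i in top else 0.0 for i, m in enumerate(masks)])
    return pruned


# Two layers with different learned mask values, both reduced to
# exactly two kept channels each.
layers = [[0.9, 0.1, 0.5], [0.2, 0.8, 0.3]]
```

This is the structural invariant the method optimizes toward; the minimax formulation replaces this hard projection with a regularized, differentiable learning process.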
Pruning for Sparse Diffusion Models based on Gradient Flow
Wan, Ben, Zheng, Tianyi, Chen, Zhaoyu, Wang, Yuxiao, Wang, Jia
Diffusion Models (DMs) have impressive capabilities among generative models, but are limited by slow inference and high computational costs. Previous works utilize one-shot structured pruning to derive lightweight DMs from pre-trained ones, but this approach often leads to a significant drop in generation quality and may remove crucial weights. We therefore propose an iterative pruning method based on gradient flow, comprising a gradient-flow pruning process and a gradient-flow pruning criterion. We employ a progressive soft pruning strategy to maintain the continuity of the mask matrix and guide it along the gradient flow of the energy function, based on the pruning criterion in sparse space, thereby avoiding the sudden information loss typically caused by one-shot pruning. The gradient-flow-based criterion prunes parameters whose removal increases the gradient norm of the loss function, enabling fast convergence of the pruned model in the iterative pruning stage. Extensive experiments on widely used datasets demonstrate that our method achieves superior efficiency and consistency with pre-trained models.
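The contrast with one-shot pruning is that mask entries decay gradually rather than jumping to zero. A minimal sketch of the progressive-soft-pruning idea (the decay rule and threshold here are assumptions for illustration, not the paper's energy-function formulation):

```python
def soft_prune_step(masks, saliencies, decay, threshold):
    """One iteration of progressive soft pruning: multiply the mask of
    each low-saliency parameter by `decay` instead of zeroing it, so the
    mask matrix stays continuous across iterations."""
    return [m * decay if s < threshold else m
            for m, s in zip(masks, saliencies)]


# Over several iterations the low-saliency mask fades toward zero,
# while the important parameter keeps a full mask.
masks = [1.0, 1.0]
for _ in range(3):
    masks = soft_prune_step(masks, saliencies=[0.01, 5.0],
                            decay=0.5, threshold=0.1)
```

In an iterative scheme the saliencies would be recomputed each step from the gradient-flow criterion; here they are fixed only to keep the sketch self-contained.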
Decay Pruning Method: Smooth Pruning With a Self-Rectifying Procedure
Yang, Minghao, Gao, Linlin, Li, Pengyuan, Li, Wenbo, Dong, Yihong, Cui, Zhiying
Deep Neural Networks (DNNs) have been widely used for various applications, such as image classification [22; 40], object segmentation [33; 35], and object detection [6; 43]. However, the increasing size and complexity of DNNs often result in substantial computational and memory requirements, posing challenges for deployment on resource-constrained platforms, such as mobile or embedded devices. Consequently, developing efficient methods to reduce the computational complexity and storage demands of large models, while minimizing performance degradation, has become essential. Network pruning is one of the most popular methods in model compression. Specifically, current network pruning methods are categorized into unstructured and structured pruning [5]. Unstructured pruning [11; 24] focuses on eliminating individual weights from a network to create fine-grained sparsity. Although these approaches achieve an excellent balance between model size reduction and accuracy retention, they often require specific hardware support for acceleration, which is impractical for general-purpose computing environments. Conversely, structured pruning [23; 18; 29] avoids these hardware dependencies by eliminating redundant network structures, thus introducing a more manageable and hardware-compatible form of sparsity. As a result, structured pruning has become popular and is extensively utilized.
LD-Pruner: Efficient Pruning of Latent Diffusion Models using Task-Agnostic Insights
Castells, Thibault, Song, Hyoung-Kyu, Kim, Bo-Kyeong, Choi, Shinkook
Latent Diffusion Models (LDMs) have emerged as powerful generative models, known for delivering remarkable results under constrained computational resources. However, deploying LDMs on resource-limited devices remains a complex issue, presenting challenges such as memory consumption and inference speed. To address this issue, we introduce LD-Pruner, a novel performance-preserving structured pruning method for compressing LDMs. Traditional pruning methods for deep neural networks are not tailored to the unique characteristics of LDMs, such as the high computational cost of training and the absence of a fast, straightforward and task-agnostic method for evaluating model performance. Our method tackles these challenges by leveraging the latent space during the pruning process, enabling us to effectively quantify the impact of pruning on model performance, independently of the task at hand. This targeted pruning of components with minimal impact on the output allows for faster convergence during training, as the model has less information to re-learn, thereby addressing the high computational cost of training. Consequently, our approach achieves a compressed model that offers improved inference speed and reduced parameter count, while maintaining minimal performance degradation. We demonstrate the effectiveness of our approach on three different tasks: text-to-image (T2I) generation, Unconditional Image Generation (UIG) and Unconditional Audio Generation (UAG). Notably, we reduce the inference time of Stable Diffusion (SD) by 34.9% while simultaneously improving its FID by 5.2% on MS-COCO T2I benchmark. This work paves the way for more efficient pruning methods for LDMs, enhancing their applicability.
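The task-agnostic part of the method is scoring a component by how far its removal moves the model's latent output, with no task metric involved. A hypothetical sketch (operator names and the Euclidean distance choice are assumptions for illustration):

```python
def latent_impact(latent_full, latent_ablated):
    """Distance in latent space between the full model's output and the
    output with one operator removed; a small distance suggests the
    operator is safe to prune, independently of the downstream task."""
    return sum((a - b) ** 2
               for a, b in zip(latent_full, latent_ablated)) ** 0.5


def rank_for_pruning(latent_full, ablated_latents):
    """Order components from least to most latent-space impact, i.e.
    the pruning order."""
    scores = {name: latent_impact(latent_full, z)
              for name, z in ablated_latents.items()}
    return sorted(scores, key=scores.get)


# "op_a" barely changes the latent; "op_b" moves it far, so the
# ranking prunes op_a first.
order = rank_for_pruning([1.0, 0.0],
                         {"op_a": [1.0, 0.01], "op_b": [0.0, 5.0]})
```

Working in latent space avoids decoding to pixels or audio, which is what makes the same score usable across T2I, UIG, and UAG.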
A Fair Loss Function for Network Pruning
Meyer, Robbie, Wong, Alexander
Model pruning can enable the deployment of neural networks in environments with resource constraints. While pruning may have only a small effect on the overall performance of the model, it can exacerbate existing biases in the model such that subsets of samples see significantly degraded performance. In this paper, we introduce the performance weighted loss function, a simple modified cross-entropy loss that can be used to limit the introduction of biases during pruning. Experiments using biased classifiers for facial classification and skin-lesion classification tasks demonstrate that the proposed method is a simple and effective tool that enables existing pruning methods to be used in fairness-sensitive contexts.
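The core mechanism is up-weighting the cross-entropy of samples from groups the model currently serves poorly. A minimal sketch (the weighting scheme `1 - group accuracy` is an illustrative assumption, not the paper's exact formula):

```python
import math


def performance_weighted_ce(probs, labels, groups, group_acc):
    """Cross-entropy where each sample's loss is scaled by how poorly
    its group is currently performing, so pruning updates cannot
    quietly sacrifice already-disadvantaged groups."""
    eps = 1e-8
    total = 0.0
    for p, y, g in zip(probs, labels, groups):
        weight = 1.0 - group_acc[g] + eps  # assumed weighting rule
        total += -weight * math.log(p[y] + eps)
    return total / len(labels)


# The same prediction costs more when the sample's group has low
# accuracy, steering pruning away from widening the gap.
probs = [[0.7, 0.3]]
loss_weak_group = performance_weighted_ce(probs, [0], ["g"], {"g": 0.5})
loss_strong_group = performance_weighted_ce(probs, [0], ["g"], {"g": 0.9})
```

Because it is just a reweighted loss, it can be dropped into any pruning method that fine-tunes or scores with cross-entropy.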
SBPF: Sensitiveness Based Pruning Framework For Convolutional Neural Network On Image Classification
Lu, Yiheng, Gong, Maoguo, Zhao, Wei, Feng, Kaiyuan, Li, Hao
Pruning techniques are used extensively to compress convolutional neural networks (CNNs) for image classification. However, the majority of pruning methods require a well pre-trained model to provide supporting parameters, such as the ℓ1-norm, BatchNorm values, and gradient information, which may lead to inconsistent filter evaluation if the parameters of the pre-trained model are not well optimized. Therefore, we propose a sensitiveness-based method that evaluates the importance of each layer from the perspective of inference accuracy by adding extra damage to the original model. Because accuracy is determined by the distribution of parameters across all layers rather than by individual parameters, the sensitiveness-based method is robust to updates of the parameters. Namely, we obtain similar importance evaluations of each convolutional layer for imperfectly trained and fully trained models. For VGG-16 on CIFAR-10, even when the original model is trained for only 50 epochs, we obtain the same evaluation of layer importance as when the model is fully trained. We then remove filters proportionally from each layer according to the quantified sensitiveness. Our sensitiveness-based pruning framework is verified efficiently on VGG-16, a customized Conv-4, and ResNet-18 with CIFAR-10, MNIST, and CIFAR-100, respectively.
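The sensitiveness measurement itself is simple to sketch: damage one layer at a time and record the accuracy drop. A hypothetical illustration (the `evaluate` and `damage` callbacks and the toy scalar "layers" are assumptions; the paper applies this to real CNN layers):

```python
def layer_sensitiveness(evaluate, damage, layers):
    """Sensitiveness of each layer = drop in the evaluation score when
    extra damage (e.g. noise or partial ablation) is applied to that
    layer only, leaving all other layers intact."""
    base = evaluate(layers)
    scores = []
    for i in range(len(layers)):
        damaged = [damage(l) if j == i else l
                   for j, l in enumerate(layers)]
        scores.append(base - evaluate(damaged))
    return scores


# Toy example: 'accuracy' is the sum of layer values, damage halves a
# layer; the layer contributing most is the most sensitive.
scores = layer_sensitiveness(sum, lambda w: w * 0.5, [3.0, 1.0])
```

Because the score compares whole-model behavior before and after damage, it depends on the parameter distribution of the layer rather than on any single weight, which is why an imperfectly trained model can already yield the final ranking.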
Adaptive Activation-based Structured Pruning
Zhao, Kaiqi, Jain, Animesh, Zhao, Ming
Pruning is a promising approach to compress complex deep learning models for deployment on resource-constrained edge devices. However, many existing pruning solutions are based on unstructured pruning, which yields models that cannot run efficiently on commodity hardware, and they require users to manually explore and tune the pruning process, which is time-consuming and often leads to sub-optimal results. To address these limitations, this paper presents an adaptive, activation-based, structured pruning approach to automatically and efficiently generate small, accurate, and hardware-efficient models that meet user requirements. First, it proposes iterative structured pruning using activation-based attention feature maps to effectively identify and prune unimportant filters. Then, it proposes adaptive pruning policies for automatically meeting the pruning objectives of accuracy-critical, memory-constrained, and latency-sensitive tasks. A comprehensive evaluation shows that the proposed method can substantially outperform state-of-the-art structured pruning works on the CIFAR-10 and ImageNet datasets. For example, on ResNet-56 with CIFAR-10, without any accuracy drop, our method achieves the largest parameter reduction (79.11%), outperforming related works by 22.81% to 66.07%, and the largest FLOPs reduction (70.13%), outperforming related works by 14.13% to 26.53%.
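An activation-based filter score can be sketched as the mean absolute activation of each filter over samples and spatial positions, which is one common way to build attention-style maps (an illustrative assumption here, not necessarily the paper's exact map):

```python
def filter_importance(activations):
    """Score each filter by its mean absolute activation across samples
    and spatial positions; filters that rarely activate score low and
    become pruning candidates.

    `activations` is a list of samples, each a list of per-filter
    flattened feature maps."""
    num_filters = len(activations[0])
    scores = [0.0] * num_filters
    for sample in activations:
        for f, fmap in enumerate(sample):
            scores[f] += sum(abs(v) for v in fmap) / len(fmap)
    return [s / len(activations) for s in scores]


# One sample, two filters with 2-element feature maps: the second
# filter never activates, so it scores zero and would be pruned first.
scores = filter_importance([[[1.0, -1.0], [0.0, 0.0]]])
```

An adaptive policy would then pick how many of the lowest-scoring filters to drop per iteration based on the accuracy, memory, or latency objective.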